NOAA SEFSC visual data extend back to 1992, but as shown in the figure below, many predictor variables are only available starting in 2003, so earlier visual data are currently excluded from further analyses.
Note: Future work could use monthly climatologies (long-term monthly averages) so that older sightings data could be included. However, dynamic drivers such as eddy and front locations could not be captured with that approach.
Visual data predictor variable availability:
The data are split into training and testing sets: data from 2009 and 2013 are held out for testing. Only observations from 2003 onward are used for modeling because of the covariate limitations described above.
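The year-based split described above can be sketched as follows. This is a minimal illustration with a toy data frame; the column names (`date`, `presence`) are assumptions, not the names used in the actual dataset.

```r
# Toy observations; real data would be read from the survey/HARP files.
obs <- data.frame(
  date = as.Date(c("2004-06-01", "2009-07-15", "2011-03-02", "2013-08-20")),
  presence = c(1, 0, 1, 0)
)
obsYear <- as.numeric(format(obs$date, "%Y"))
obs <- obs[obsYear >= 2003, ]          # drop pre-2003 data (covariate limits)
testYears <- c(2009, 2013)             # held out for testing
isTest <- as.numeric(format(obs$date, "%Y")) %in% testYears
train <- obs[!isTest, ]
test  <- obs[isTest, ]
```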
The visual data selected for modeling are displayed on the map below. Data from 2009 were held back for testing. Blue markers indicate HARP locations.
The time series below show the acoustic data used for modeling. Data from 2011 and 2012 were used for training, and 2013 data were held back for testing. These density magnitudes are very preliminary, and models used only presence/absence data.
Acoustic Timeseries:
Covariates have different distributions across the observations.
Distributions of covariates from acoustic observations:
Distributions of covariates from the visual observations:
Some of these covariates are interrelated to varying degrees. HYCOM estimates of salinity at the surface (HYCOM_SALIN_0) and at 100 m (HYCOM_SALIN_100) are very similar, so surface salinity was selected for simplicity. HYCOM current and upwelling estimates at 100 m were also removed (HYCOM_MAG_100, HYCOM_UPVEL_100). HYCOM current direction is site-specific in the acoustic data, so it was excluded for now.
Remaining correlations are examined in the figure below; values above the diagonal are correlation coefficients, with values near 1 indicating strong correlation. Highly correlated covariates should not be used together in the same model. Day of year was excluded to avoid artifacts arising from the temporal differences between the datasets (acoustic data are collected year-round, while visual data are collected in spring/summer).
Covariate Correlations:
Some variables, including chlorophyll, current magnitude, mixed layer depth, and distance to fronts, are highly skewed and were log-transformed.
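The log transform can be sketched as below. The skewed-variable names follow the covariate labels used elsewhere in this document, except `FrontDist`, which is an assumed name; the small offset guarding against zeros is also an assumption (the original offset, if any, is not shown).

```r
# Toy covariate table; real values come from the remote-sensing/HYCOM extracts.
covars <- data.frame(
  CHL         = c(0.1, 0.5, 12),    # chlorophyll
  HYCOM_MAG_0 = c(0.02, 0.3, 1.1),  # surface current magnitude
  HYCOM_MLD   = c(5, 40, 120),      # mixed layer depth
  FrontDist   = c(2, 30, 400)       # distance to nearest front (name assumed)
)
skewed <- c("CHL", "HYCOM_MAG_0", "HYCOM_MLD", "FrontDist")
for (v in skewed) {
  # offset avoids log10(0); assumed handling
  covars[[paste0("log10_", v)]] <- log10(covars[[v]] + 1e-6)
}
```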
Below, the two sets of covariates have been combined and transformed:
To get an idea of the basic predictive power of these covariates, we can look at presence/absence relative to each variable. This also provides an opportunity to look at the range of values observed for each covariate in the visual and acoustic datasets. In the plots below, dotted lines indicate the distribution of each covariate when animals were present, and solid lines indicate the distribution when animals were absent. Note that these plots do not account for effort.
Acoustic kernel densities:
Visual kernel densities:
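Plots like the ones above can be produced with base R's `density()`. This is a sketch on simulated data; the SST values and the split between presence and absence are made up for illustration.

```r
# Simulated SST values for detections (present) and non-detections (absent).
set.seed(1)
sst     <- c(rnorm(200, 26, 2), rnorm(200, 23, 2))
present <- rep(c(TRUE, FALSE), each = 200)

# Kernel density estimates for each group.
dPres <- density(sst[present])
dAbs  <- density(sst[!present])

# Solid = absent, dotted = present, matching the convention described above.
plot(dAbs, lty = 1, main = "SST", xlab = "SST (deg C)")
lines(dPres, lty = 3)
```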
Currently, weights are based on the following logic:
Acoustic data represent presence in 1 day bins,
Acoustic data have a truncation distance of 1.3 km for Cuvier’s beaked whale (95% of group detections occur within this range), for a total monitored area of \((1.3\ \text{km})^{2} \pi\),
Visual data represent presence during ~10 km transect segments traveled at ~10 knots (~18.5 km/hr); therefore, each visual data point represents ~0.54 hours of effort.
The visual truncation distance for Cuvier’s beaked whale is estimated from these data as 5.4 km (see figure below). The effective strip width (ESW) is multiplied by the transect segment length (usually ~10 km) and doubled to estimate the total area swept.
\[\text{Acoustic to Visual ratio} = \frac{24\ \text{hr} \cdot (1.3\ \text{km})^{2}\,\pi}{(\text{Segment length}/18.52\ \text{km/hr}) \cdot 5.4\ \text{km} \cdot \text{Segment length} \cdot 2} = \frac{127.42}{58.32} \approx 2.18:1 \]
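The ratio can be computed directly in R. A 10 km segment is assumed here, as in the text.

```r
segLength <- 10          # km, typical transect segment length
speed     <- 18.52       # km/hr survey speed (~10 knots)
acArea    <- pi * 1.3^2  # km^2 monitored acoustically (1.3 km truncation)

acEffort  <- 24 * acArea                                # hr * km^2 per daily acoustic bin
visEffort <- (segLength / speed) * 5.4 * segLength * 2  # hr * km^2 per visual segment
ratio     <- acEffort / visEffort
round(c(acoustic = acEffort, visual = visEffort, ratio = ratio), 2)
```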
Best visual detection probability model:
Zeros in the visual dataset are down-weighted by g(0) for this species, as estimated by Palka (2007).
## Mean visual data point weight: 57
## Acoustic data point weight: 127
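The down-weighting of visual zeros might look like the sketch below. The g(0) value used here is a placeholder, not the Palka (2007) estimate, and the column names are assumptions.

```r
g0        <- 0.4  # placeholder; substitute the Palka (2007) g(0) estimate
visWeight <- 57   # mean visual data point weight from above

# Toy presence/absence vector for visual observations.
yVis <- c(1, 0, 0, 1)

# Absences (zeros) are down-weighted by g(0); presences keep the full weight.
weightsG0 <- ifelse(yVis == 0, visWeight * g0, visWeight)
```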
Models were fit using avNNet from the caret package in R.
Neural networks were run on the acoustic-only, visual-only, and joint acoustic/visual datasets.
Models have the following characteristics:
50 averaged repeats with random node initialization
Include 8 covariates
One hidden layer
Weighted training data
Hidden node layer sizes from 2 to 14 were tested in 2 node increments to search for optimal network size.
```r
## ACOUSTIC ONLY
AcCounter <- 0
# put together the formula
f.AcOnly_NN1 <- as.formula(paste("yAcOnly_TF ~", paste(n[model1.indices], collapse = " + ")))
# Iterate over a range of hidden layer sizes between 2 and 14 nodes.
for (layerSize in layerSizeList){
  AcCounter <- AcCounter + 1
  # train network
  nn_AcOnly[[AcCounter]] <- avNNet(f.AcOnly_NN1, data = AcOnly_train_scaled,
                                   size = layerSize,
                                   repeats = trainRepeats,
                                   na.action = na.omit,
                                   rang = 0.7,
                                   decay = 0.0001,
                                   maxit = 1000,
                                   trace = FALSE)
  # predict on train data and estimate Mean Squared Error (MSE)
  pr.nn_AcOnly_train[[AcCounter]] <- predict(nn_AcOnly[[AcCounter]], AcOnly_train_scaled[, model1.indices])
  MSE.nn_AcOnly_train[[AcCounter]] <- sum((AcOnly_train_scaled$yAcOnly_TF -
    pr.nn_AcOnly_train[[AcCounter]])^2) / nrow(AcOnly_train_scaled)
  # predict on test data and estimate MSE
  pr.nn_AcOnly_test[[AcCounter]] <- predict(nn_AcOnly[[AcCounter]], AcOnly_test_scaled[, model1.indices])
  MSE.nn_AcOnly_test[[AcCounter]] <- sum((AcOnly_test_scaled$yAcOnly_TF -
    pr.nn_AcOnly_test[[AcCounter]])^2) / nrow(AcOnly_test_scaled)
  cat(paste("Done with AcOnly model iteration ", AcCounter, " of ", length(layerSizeList), ": Layer Size = ", layerSize, "\n"))
}
```
```r
## VISUAL ONLY
modelCounter <- 0
# put together the formulas
f.VisOnly_NN1 <- as.formula(paste("yVisOnly_TF ~", paste(n[model1.indices], collapse = " + ")))
f.Joint_NN1 <- as.formula(paste("y_TF ~", paste(n[model1.indices], collapse = " + ")))
for (layerSize in layerSizeList){
  modelCounter <- modelCounter + 1
  # train network (zeros down-weighted by g(0))
  nn_VisOnly[[modelCounter]] <- avNNet(f.VisOnly_NN1, VisOnly_train_scaled,
                                       weights = VisOnly_train_scaled$weightsG0,
                                       size = layerSize,
                                       repeats = trainRepeats,
                                       na.action = na.omit,
                                       rang = 0.7,
                                       decay = 0.0001,
                                       maxit = 10000,
                                       trace = FALSE)
  # predict on train data and estimate weighted MSE
  pr.nn_VisOnly_train[[modelCounter]] <- predict(nn_VisOnly[[modelCounter]], VisOnly_train_scaled[, model1.indices])
  MSE.nn_VisOnly_train[[modelCounter]] <- sum(VisOnly_train_scaled$weightsG0 *
    (VisOnly_train_scaled$yVisOnly_TF -
     pr.nn_VisOnly_train[[modelCounter]])^2) /
    nrow(VisOnly_train_scaled)
  # predict on test data and estimate weighted MSE
  pr.nn_VisOnly_test[[modelCounter]] <- predict(nn_VisOnly[[modelCounter]], VisOnly_test_scaled[, model1.indices])
  MSE.nn_VisOnly_test[[modelCounter]] <- sum(VisOnly_test_scaled$weightsG0 *
    (VisOnly_test_scaled$yVisOnly_TF -
     pr.nn_VisOnly_test[[modelCounter]])^2) /
    nrow(VisOnly_test_scaled)

  ## JOINT
  nn_Joint[[modelCounter]] <- avNNet(f.Joint_NN1, Joint_train_scaled,
                                     weights = Joint_train_scaled$weightsG0,
                                     size = layerSize,
                                     repeats = trainRepeats,
                                     na.action = na.omit,
                                     rang = 0.7,
                                     decay = 0.0001,
                                     maxit = 10000,
                                     trace = FALSE)
  pr.nn_Joint_train[[modelCounter]] <- predict(nn_Joint[[modelCounter]], Joint_train_scaled[, model1.indices], na.action = na.omit)
  MSE.nn_Joint_train[[modelCounter]] <- sum(Joint_train_scaled$weightsG0 * (Joint_train_scaled$y_TF -
    pr.nn_Joint_train[[modelCounter]])^2) /
    nrow(Joint_train_scaled)
  pr.nn_Joint_test[[modelCounter]] <- predict(nn_Joint[[modelCounter]], Joint_test_scaled[, model1.indices], na.action = na.omit)
  MSE.nn_Joint_test[[modelCounter]] <- sum(Joint_test_scaled$weightsG0 * (Joint_test_scaled$y_TF -
    pr.nn_Joint_test[[modelCounter]])^2) /
    nrow(Joint_test_scaled)
  cat(paste("Done with VisOnly and Joint model iteration ", modelCounter, " of ", length(layerSizeList), ": Layer Size = ", layerSize, "\n"))
}
```
Models were compared using a Kolmogorov-Smirnov test to compare predicted and observed presence/absence in the test data.
## [1] "cross entropy scores (lower is better)"
## 1 3 5 7 9 11 13
## Acoustic - Train 5.31 4.86 4.23 3.84 3.28 2.66 2.32
## Acoustic - Test 7.42 7.27 7.13 7.27 7.32 7.08 7.77
## Visual - Train 0.65 0.31 0.18 0.16 0.16 0.16 0.15
## Visual - Test 0.23 0.23 0.23 0.23 0.23 0.23 0.23
## Joint - Train 5.07 3.48 3.05 2.55 2.32 2.00 1.81
## Joint - Test 5.63 5.01 6.06 6.53 5.91 6.99 6.60
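The scores above are cross-entropies between predicted probabilities and observed presence/absence. A standard binary cross-entropy in R looks like the sketch below; the exact implementation used for the table is not shown, so treat this as an assumed form.

```r
# Binary cross-entropy: y is 0/1 observed presence, p is predicted probability.
crossEntropy <- function(y, p, eps = 1e-12) {
  p <- pmin(pmax(p, eps), 1 - eps)  # clamp to avoid log(0)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Example: well-calibrated predictions give a low score.
crossEntropy(c(1, 0, 1), c(0.9, 0.1, 0.8))
```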
For the best model in each category, the importance of each input variable was calculated across the 50 model iterations.
| Covariate | AcOnly | VisOnly | Joint |
|---|---|---|---|
| SST | 19.0 | 13.5 | 18.8 |
| SSH | 15.8 | 4.7 | 17.6 |
| log10_CHL | 14.7 | 1.6 | 25.2 |
| log10_HYCOM_MLD | 8.1 | 30.8 | 3.7 |
| HYCOM_SALIN_0 | 17.5 | 8.2 | 13.9 |
| log10_HYCOM_MAG_0 | 8.1 | 5.5 | 6.5 |
| HYCOM_UPVEL_50 | 5.6 | 16.9 | 5.0 |
| EddyDist | 11.2 | 18.8 | 9.3 |
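The table values appear to be percent importance per model. Given raw importance scores (e.g. from connection-weight methods averaged over the 50 repeats; the actual method is not shown here), scaling to percentages is straightforward. The raw scores below are made up for illustration.

```r
# Hypothetical raw importance scores for a few covariates.
rawImp <- c(SST = 3.4, SSH = 2.8, log10_CHL = 2.6, HYCOM_SALIN_0 = 3.1)

# Scale so the scores sum to 100, as in the table above.
pctImp <- 100 * rawImp / sum(rawImp)
round(pctImp, 1)
```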
Example network:
Predictions were made on the acoustic test dataset and compared with actual observations for 2013. The predicted probability of encountering animals was compared with the actual weekly rate of occurrence of animals at each site.
CAVEAT: Encounter probability from the data is estimated as the fraction of days per week during which this species was detected.
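The observed weekly encounter probability described in the caveat can be sketched as below, using a toy daily 0/1 detection series; the variable names are illustrative.

```r
# Toy daily detection record for four weeks at one site.
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 28)
det   <- rep(c(1, 0, 0, 0, 1, 0, 0), 4)  # 1 = species detected that day

# Fraction of days per week with at least one detection.
wk         <- format(dates, "%Y-%U")  # year-week label
weeklyProb <- tapply(det, wk, mean)
```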
Predicted and observed encounter probabilities at passive acoustic sites using the acoustic-only model (Site order: MC, GC, DT):
Predicted and observed encounter probabilities at passive acoustic sites using the visual-only model (Site order: MC, GC, DT):
Predicted and observed encounter probabilities at passive acoustic sites using the joint model (Site order: MC, GC, DT):
Models were evaluated for summer (July 2009) and winter (January 2009) across the entire Gulf of Mexico (US EEZ beyond the 200 m contour).
Summer 2009 predicted distribution and test sightings:
Summer 2009 predicted probability of sighting and test sightings:
Winter 2009 predicted distribution:
Spatial model predictions were generated using oceanographic variables averaged by month between 2003 and 2015.
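Computing those monthly climatologies amounts to averaging each calendar month across the 2003–2015 years. A minimal sketch, with an assumed data frame layout and a toy seasonal SST cycle:

```r
# Toy monthly covariate series spanning 2003-2015 (156 months).
env <- data.frame(
  date = seq(as.Date("2003-01-15"), as.Date("2015-12-15"), by = "month"),
  SST  = 25 + 3 * sinpi(seq_len(156) / 6)  # synthetic 12-month cycle
)

# Average each calendar month across all years -> 12 climatological values.
env$month   <- format(env$date, "%m")
climatology <- aggregate(SST ~ month, data = env, FUN = mean)
```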